Skip to content

Conversation

@ujfalusi
Copy link
Collaborator

Hi,

inital (draft) support for compress offload with IPC4 on Intel platforms (MTL+).

There are still missing features, and few areas to improve, tune, but it works reasonably well and the hope is that the IPC3 compr is still functional.

@dbaluta, if you could be able to test that, we would really appreciate it.

@ujfalusi
Copy link
Collaborator Author

I don't see these errors with gcc (GCC) 15.2.1 20251112 locally, hrm.

@ujfalusi ujfalusi force-pushed the peter/topic/ipc4_compr branch from cea54e7 to c3f4c47 Compare January 14, 2026 14:08
@ujfalusi ujfalusi changed the title ASoC: SOF: Initial support for IPCV4 compress offload ASoC: SOF: Initial support for IPC4 compress offload Jan 19, 2026
@dbaluta
Copy link
Collaborator

dbaluta commented Jan 19, 2026

Will test this on IMX platforms, but we are only using IPC3 so not sure how much it helps. I'm not working on moving IMX to IPC4 but it will take a while.

Will be back with the results asap.

@ujfalusi
Copy link
Collaborator Author

Will test this on IMX platforms, but we are only using IPC3 so not sure how much it helps. I'm not working on moving IMX to IPC4 but it will take a while.

@dbaluta, we want to make sure that we don't break iMX (IPC3) and we are on reverse grounds that we can only test the new IPC4 support.

Will be back with the results asap.

Thank you!

Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, some minor opens, some rebasing/squashing probably needed.

if (!fe->fe_compr) {
be_substream->runtime = fe_substream->runtime;
} else {
be_substream->runtime = kzalloc(sizeof(*be_substream->runtime), GFP_KERNEL);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason why we dont allocate this in the same logic as non compressed i.e. to balance the conditional logic here

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The FE is not a PCM, there we don't have runtime at all, but BE is running as a PCM.

struct sof_ipc4_timestamp_info {
struct sof_ipc4_copier *host_copier;
struct sof_ipc4_copier *dai_copier;
u64 stream_start_offset;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw do we subtract any media header from our positions or does cplay need to include this ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't and we don't need to, this is the offset when the first real audio data left the DSP.
The media header is eaten up if it is eaten up by the decoder and it does not matter what the data was.
In compr there is no relation between what the host copies and what is played out.


spcm->stream[dir].cstream = cstream;
spcm->stream[dir].posn.host_posn = 0;
spcm->stream[dir].posn.dai_posn = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I assume our dai position will exclude the header as that's not rendered, but host position may include header ? Just wondering if cplay cares.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't use the dai_posn in ipc4 and there is no meaning of header in this regard. The DAI is running in PCM mode.

/* unprepare and free the list of DAPM widgets */
sof_widget_list_unprepare(sdev, spcm, dir);

cancel_work_sync(&spcm->stream[dir].period_elapsed_work);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be cancel_work_flush_sync() before we free the widgets ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hrm, might be better, but this is where the compr has been already stopped and we just report elapsed to ALSA.
I'll take a look.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is following how normal PCM is handled, I will leave it like this as well.

cstream->dma_buffer.dev.type = SNDRV_DMA_TYPE_DEV_SG;
cstream->dma_buffer.dev.dev = dev;

ret = snd_compr_malloc_pages(cstream, crtd->buffer_size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, is there any size refinement here like with PCMs or is cplay request taken as-is ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That I'm not sure, but we place a limit on frag size and number in later patch.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The info in get_caps are 'hints' for user space and it can set whatever (cplay does this), so I added checks for buffer_size and number of fragments to avoid invalid configuration.

/*
* stream_start_offset is updated to memory window by FW based on
* pipeline statistics and it may be invalid if host query happens before
* the statistics is complete. And it will not change after the first initiailization.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may need a comment on whether header is included or not.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Header has no meaning in here, this is pure PCM data, exactly like normal PCMs, we count the samples played out.


out:
caps->direction = cstream->direction;
caps->min_fragment_size = 3 * 1024;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

based on any particular format ? Maybe worth a comment

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gut feeling ;) I will add defines :D

@dbaluta
Copy link
Collaborator

dbaluta commented Jan 20, 2026

@ujfalusi Tested this today with sof-dev branch and I get some problems but could be unrelated to the change. Need to check if we have some internal patches not upstreamed.

@dbaluta
Copy link
Collaborator

dbaluta commented Jan 21, 2026

@ujfalusi Tested the patches with IMX8MP using IPC3 compress implementation with MP3 and AAC and it works fine. I will add my comments to the code later.

caps->max_fragment_size = 16 * 1024;
caps->min_fragments = 4;
caps->max_fragments = 8;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to upstream a similar change for IPC3 and got this comments:

https://lists.openwall.net/linux-kernel/2022/01/19/64

So how would we know which formats are supported?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dbaluta we're adding the capability in the firmware for the decoder to advertise the codecs supported. Please stay tuned for the firmware side patches.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We get the information from fw on what codecs are supported, so this should be fine, it shall cover cases with different fw configurations.

@ujfalusi ujfalusi force-pushed the peter/topic/ipc4_compr branch from c90259d to 7c5f0df Compare January 22, 2026 08:35
ujfalusi and others added 10 commits January 22, 2026 10:36
With DPCM when compr is used on FE side, the BE is still running as
'normal' PCM.
It is expected that the be_substream->runtime on the BE is not NULL, for
example some codec drivers expect to have the substream->runtime valid, to
store configuration for example.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
In preparation for adding support for compressed offload support for
IPC4, rename the current compress implementation with the IPC3 prefix.
Introduce a new field in struct sof_ipc_pcm_ops to save the
IPC-specific compressed ops pointer. This should be set when the
component driver ops are assigned during SOF device probe. Expose a couple
of common functions that will be used by both IPC-specific implementations
and rename the compress.c file to ipc3-compress.c

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
These are common functions that will also be needed for the IPC4
compressed support.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
In order to reuse the pipeline triggering logic for compressed support
with IPC4, modify the signature of the trigger and hw_free PCM IPC ops
so that they can be reused.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
…ng stream

After the host DMA IID is released, reset the curr_pos to 0 for a clean
start for subsequent stream starts. This is not needed for PCM streams
but for compressed streams.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Add new ops in the struct snd_sof_ops for platform-specific ops for
compresssed streams. Also, define and set them for the HDA platforms.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Co-developed-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
The SOF_IPC4_MOD_INIT_DATA_ID_MODULE_DATA type within the module_init_ext
area is module specific init data.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
…dules

Add support for handling init_ext_module_data for process modules, which is
going to be used by decoder and encoder type of process mdoules.
The support is generic and it can be extended to other type of process
modules or other module types than process with a small update of
sof_ipc4_add_init_ext_module_data() function.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
…cm.c

The support for compressed stream will also need to have access to the
same information which is used for delay reporting for DAI data
progression tracking.

Make the necessary struct and functions to be available and premare them to
be called without a valid substream.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
…_params

SOF_INFO (id == 35) tuple holds tuple structured information about SOF
features.

The first entry in SOF_INFO is the SOF_CODEC_INFO (id == 0) which contains
information about the supported codecs for decode/encode in the booted
firmware.

If present in the fw_config payload, make a copy of it and store it
sof_ipc4_fw_data->codec_info to be used by the compressed code.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
@ujfalusi ujfalusi force-pushed the peter/topic/ipc4_compr branch 2 times, most recently from 228cd7c to a57957e Compare January 22, 2026 12:36
@ujfalusi
Copy link
Collaborator Author

Changes since v2:

  • rewrite how we report the min fragment size to handle deep buffer on host side

@ujfalusi ujfalusi force-pushed the peter/topic/ipc4_compr branch 2 times, most recently from 52d57d7 to f992ff0 Compare January 22, 2026 15:04
@ujfalusi
Copy link
Collaborator Author

Changes since v3:

  • fix ERROR: modpost: "__udivdi3" [sound/soc/sof/snd-sof.ko] undefined! in x86_32 build

@ujfalusi ujfalusi force-pushed the peter/topic/ipc4_compr branch from f992ff0 to f16f40d Compare January 23, 2026 08:37
ranj063 and others added 3 commits January 23, 2026 10:37
Set and define the compressed ops for IPC4.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Co-developed-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
There is no need to select IPC3 and IPC4 along with INTEL_CNL as INTEL_CNL
selects both.

Similarly, INTEL_MTL selects IPC4, so there is no need to do that for LNL,
PTL and NVL.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Select SOF_COMPRESS for TGL and newer platforms, on Intel devices the
compressed support is available with IPC4 only.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Decoder and encoder modules fall under process modules in SOF.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Co-developed-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
@ujfalusi
Copy link
Collaborator Author

Changes since v4:
Drop the min_fragment_size storage in sof_compr_stream and remove the use of it in ipc4 code:
It was allocated and stored behind cstream->runtime->private_data, but that pointer is also used by the Intel platform code and it was overtaken on open.
So, (1) we were leaking memory, (2) we were accessing cstream->runtime->private_data data which points to a completely different struct.

It is a surprise that this did not blow up, but it somehow managed to 'work', just that the min_fragment_size turned 0 after the compr_open (as we used the data for read only).

Added a helper to calculate on spot the minimal fragment size instead, it is used only during open operation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants